NSF PAR Search | NSF Public Access Repository

GraCFL: A Holistically Designed Vertex-Centric Graph System for CFL Reachability

https://doi.org/10.1145/3721145.3725762

Fuad, Sakib; Sabet, Amir Hossein Nodehi; Farooq, Umar; Zhao, Zhijia (June 2025, ACM)

Free, publicly-accessible full text available June 8, 2026
PANNS: Enhancing Graph-based Approximate Nearest Neighbor Search through Recency-aware Construction and Parameterized Search

https://doi.org/10.1145/3710848.3710867

Yin, Xizhe; Gao, Chao; Zhao, Zhijia; Gupta, Rajiv (February 2025, ACM)

Free, publicly-accessible full text available February 28, 2026
IncBoost: Scaling Incremental Graph Processing for Edge Deletions and Weight Updates

https://doi.org/10.1145/3698038.3698524

Yin, Xizhe; Zhao, Zhijia; Gupta, Rajiv (November 2024, ACM proceedings)

Full Text Available
Core Graph: Exploiting Edge Centrality to Speedup the Evaluation of Iterative Graph Queries

https://doi.org/10.1145/3627703.3629571

Jiang, Xiaolin; Afarin, Mahbod; Zhao, Zhijia; Abu-Ghazaleh, Nael; Gupta, Rajiv (April 2024, European Conference on Computer Systems (EuroSys), ACM)

Full Text Available
Detecting Potential User-data Save & Export Losses due to Android App Termination

https://doi.org/10.1109/AST58925.2023.00019

Rahaman, Sydur; Farooq, Umar; Neamtiu, Iulian; Zhao, Zhijia (May 2023, 2023 IEEE/ACM International Conference on Automation of Software Test (AST))

Full Text Available
Glign: Taming Misaligned Graph Traversals in Concurrent Graph Processing

https://doi.org/10.1145/3567955.3567963

Yin, Xizhe; Zhao, Zhijia; Gupta, Rajiv (December 2022, Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems)

Full Text Available
JSONSki: streaming semi-structured data with bit-parallel fast-forwarding

https://doi.org/10.1145/3503222.3507719

Jiang, Lin; Zhao, Zhijia (February 2022, Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'22))

Full Text Available
dsJSON: A Distributed SQL JSON Processor

https://doi.org/10.1145/3588957

Saeedan, Majid; Eldawy, Ahmed; Zhao, Zhijia (May 2023, Proceedings of the ACM on Management of Data)

The popularity of JSON as a data interchange format has resulted in large numbers of datasets available for processing. Users would like to analyze this data using SQL queries, but existing distributed systems limit their users to only two specific formats, JSONLine and GeoJSON. The complexity of JSON schemas makes it challenging to parse arbitrary files in a modern distributed system while producing records with a unified schema that can be processed with SQL. To address these challenges, this paper introduces dsJSON, a state-of-the-art distributed JSON processor that overcomes limitations in existing systems and scales to big and complex data. dsJSON introduces the projection tree, a novel data structure that applies selective parsing of nested attributes to produce records that are ready for SQL processors. The key objective of the projection tree is to parse a big JSON file in parallel to produce records with a unified schema that can be processed with SQL. dsJSON is integrated into SparkSQL, which enables users to run arbitrary SQL queries on complex JSON files. It also pushes projection and filter down into the parser for full integration between the parser and the processor. Experiments on up to two terabytes of real data show that dsJSON performs several times faster than existing systems. It can also efficiently parse extremely large files not supported by existing distributed parsers.
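The projection idea in the dsJSON abstract can be illustrated with a small sketch. This is not dsJSON's implementation: the function names `build_projection_tree` and `project` are hypothetical, the dotted-path syntax is an assumption, and the sketch filters objects that `json.loads` has already parsed in full, whereas the abstract describes pushing the projection into the parser itself. It only shows how a tree of requested attributes can yield flat records that share one schema.

```python
import json

def build_projection_tree(paths):
    """Turn dotted attribute paths into a nested dict (a toy projection tree)."""
    tree = {}
    for path in paths:
        node = tree
        for key in path.split("."):
            node = node.setdefault(key, {})
    return tree

def project(value, tree, prefix=""):
    """Collect only the attributes named in the tree into one flat record."""
    record = {}
    for key, subtree in tree.items():
        name = prefix + key
        child = value.get(key) if isinstance(value, dict) else None
        if subtree:
            # Inner node: descend into the nested object (or into None if absent).
            record.update(project(child, subtree, name + "."))
        else:
            # Leaf: emit the value, or None when the attribute is missing.
            record[name] = child
    return record

# Hypothetical query that needs three attributes, two of them nested.
tree = build_projection_tree(["id", "user.name", "user.geo.city"])
docs = [
    '{"id": 1, "user": {"name": "a", "geo": {"city": "X", "zip": "0"}}, "extra": [1, 2]}',
    '{"id": 2, "user": {"name": "b"}}',
]
records = [project(json.loads(d), tree) for d in docs]
```

Both records end up with the same three columns, with `None` standing in for the missing nested attribute, which is the kind of unified schema a SQL engine expects.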
Tripoline: generalized incremental graph processing via graph triangle inequality

https://doi.org/10.1145/3447786.3456226

Jiang, Xiaolin; Xu, Chengshuo; Yin, Xizhe; Zhao, Zhijia; Gupta, Rajiv (April 2021, EuroSys '21: Proceedings of the Sixteenth European Conference on Computer Systems)
For compute-intensive iterative queries over a streaming graph, it is critical to evaluate the queries continuously and incrementally for best efficiency. However, existing incremental graph processing requires a priori knowledge of the query (e.g., the source vertex of a vertex-specific query); otherwise, it has to fall back to the expensive full evaluation that starts from scratch. To alleviate this restriction, this work presents a principled solution to generalizing incremental graph processing, such that queries that are not known a priori can also be evaluated incrementally. The solution centers around the concept of graph triangle inequalities, an idea inspired by the classical triangle inequality principle in Euclidean space. Interestingly, similar principles can also be derived for many vertex-specific graph problems. These principles can help establish rigorous constraints between the evaluation of one graph query and the results of another, thus enabling reuse of the latter to accelerate the former. Based on this finding, a novel streaming graph system, called Tripoline, is built, which enables incremental evaluation of queries without a priori knowledge of them. Built on top of a state-of-the-art shared-memory streaming graph engine (Aspen), Tripoline natively supports high-throughput, low-cost graph updates. A systematic evaluation with a set of eight vertex-specific graph problems and four real-world large graphs confirms both the effectiveness of the proposed techniques and the efficiency of Tripoline.
Full Text Available
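
The graph triangle inequality described in the Tripoline abstract can be sketched for one problem, single-source shortest paths. The sketch below is not Tripoline's code: it assumes a small undirected graph with non-negative weights (so the distance from u to r equals the distance from r to u), the vertex names and the `dijkstra` helper with its `init` parameter are hypothetical, and a heap-based relaxation stands in for the system's incremental engine. It reuses the result of a standing query from `r` as upper bounds, dist(u, v) <= dist(u, r) + dist(r, v), for a new query from `u`.

```python
import heapq

INF = float("inf")

def dijkstra(graph, source, init=None):
    """Shortest distances from source; `init` optionally seeds valid upper bounds."""
    dist = dict(init) if init else {v: INF for v in graph}
    dist[source] = 0
    # Every vertex with a finite bound may still improve its neighbours.
    heap = [(d, v) for v, d in dist.items() if d < INF]
    heapq.heapify(heap)
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale entry; a smaller value was pushed later
        for w, weight in graph[v]:
            if d + weight < dist[w]:
                dist[w] = d + weight
                heapq.heappush(heap, (dist[w], w))
    return dist

# Undirected toy graph as adjacency lists of (neighbour, weight).
graph = {
    "r": [("a", 2), ("b", 5)],
    "a": [("r", 2), ("b", 1), ("u", 4)],
    "b": [("r", 5), ("a", 1), ("u", 1)],
    "u": [("a", 4), ("b", 1)],
}

standing = dijkstra(graph, "r")  # result of an earlier query from r
# Triangle inequality: dist(u, v) <= dist(u, r) + dist(r, v).
bounds = {v: standing["u"] + standing[v] for v in graph}
seeded = dijkstra(graph, "u", init=bounds)  # new query, seeded with the bounds
scratch = dijkstra(graph, "u")              # same query evaluated from scratch
```

The seeded run starts from finite bounds instead of infinity and converges to the same distances as the from-scratch run, because relaxation only ever lowers valid upper bounds.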
Scalable structural index construction for JSON analytics

https://doi.org/10.14778/3436905.3436926

Jiang, Lin; Qiu, Junqiao; Zhao, Zhijia (December 2020, Proceedings of the VLDB Endowment)
JavaScript Object Notation (JSON) and its variants have gained great popularity in recent years. Unfortunately, the performance of their analytics is often dragged down by expensive JSON parsing. To address this, recent work has shown that building bitwise indices on JSON data, called structural indices, can greatly accelerate querying. Despite its promise, the existing structural index construction does not scale well as records become larger and more complex, due to its (inherently) sequential construction process and the involvement of costly memory copies that grow as the nesting level increases. To address the above issues, this work introduces Pison, a more memory-efficient structural index constructor with support for intra-record parallelism. First, Pison features a redesign of the bottleneck step in the existing solution. The new design is not only simpler but also more memory-efficient. More importantly, Pison is able to build structural indices for a single bulky record in parallel, enabled by a group of customized parallelization techniques. Finally, Pison is also optimized for better data locality, which is especially critical in the scenario of bulky record processing. Our evaluation using real-world JSON datasets shows that Pison achieves a 9.8X speedup (on average) over the existing structural index construction solution for bulky records and a 4.6X speedup (on average) in end-to-end performance (indexing plus querying) over a state-of-the-art SIMD-based JSON parser on a 16-core machine.
Full Text Available
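
To make the notion of a structural index from the Pison abstract concrete, here is a minimal sketch of what such an index can contain. It is not Pison's algorithm: it scans one character at a time instead of using bitwise or parallel construction, it records only colons, and the names `build_structural_index` and `positions` are hypothetical. Each nesting level gets one bitmap (a Python integer) whose bit i is set when character i is a colon at that level and outside any string.

```python
def build_structural_index(text):
    """Return {level: bitmap}; bit i is set if text[i] is a structural colon at that level."""
    colons = {}
    level = 0
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False    # closing quote
        elif ch == '"':
            in_string = True         # opening quote
        elif ch in "{[":
            level += 1
        elif ch in "}]":
            level -= 1
        elif ch == ":":
            colons[level] = colons.get(level, 0) | (1 << i)
    return colons

def positions(bitmap):
    """List the indices of the set bits in a bitmap, lowest first."""
    out = []
    i = 0
    while bitmap:
        if bitmap & 1:
            out.append(i)
        bitmap >>= 1
        i += 1
    return out

record = '{"a": 1, "b": {"c:d": "x:y", "e": 2}}'
index = build_structural_index(record)
top_level = positions(index[1])   # colons that follow "a" and "b"
nested = positions(index[2])      # colons that follow "c:d" and "e"
```

A query for a top-level field can then jump between the positions in `top_level` without tokenizing the nested object, and the colons inside the strings `"c:d"` and `"x:y"` never appear in any bitmap.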
